Discovering Correction Rules for Auto Editing
Abstract
This paper describes a framework that extracts effective correction rules from a sentence-aligned corpus and demonstrates a practical application: auto-editing with the discovered rules. The framework computes the Levenshtein distance between aligned sentences to identify the key parts of each rule, then uses the editing corpus to filter, condense, and refine the rules. Rule candidates take the form A → B, where A is the erroneous pattern and B is the correct pattern. The framework is language independent, so it can be applied to other languages. In the evaluation, experts annotated 67.2% of the top 1500 ranked rules as correct or mostly correct. Based on these rules, we have developed an online auto-editing system, available for demonstration at http://ppt.cc/02yY.
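The rule-extraction idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`align`, `extract_rules`, `rank_rules`), the whitespace tokenization, the toy sentence pairs, and the frequency-based ranking are all assumptions made for illustration. The sketch aligns each erroneous sentence with its corrected version using a token-level Levenshtein alignment, groups adjacent edits into A → B candidates, and counts how often each candidate recurs across the corpus.

```python
from collections import Counter


def align(src, tgt):
    """Token-level Levenshtein alignment.

    Returns (distance, ops), where ops is a list of (op, src_tok, tgt_tok)
    and op is one of "keep", "sub", "del", "ins".
    """
    n, m = len(src), len(tgt)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if src[i - 1] == tgt[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # match / substitution
    # Walk back from the bottom-right cell to recover one optimal edit path.
    ops, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and src[i - 1] == tgt[j - 1] \
                and dist[i][j] == dist[i - 1][j - 1]:
            ops.append(("keep", src[i - 1], tgt[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(("sub", src[i - 1], tgt[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("del", src[i - 1], None))
            i -= 1
        else:
            ops.append(("ins", None, tgt[j - 1]))
            j -= 1
    ops.reverse()
    return dist[n][m], ops


def extract_rules(src, tgt):
    """Group runs of adjacent edits into (A, B) rule candidates."""
    _, ops = align(src, tgt)
    rules, a, b = [], [], []
    # The trailing sentinel "keep" flushes a group that ends the sentence.
    for op, s, t in ops + [("keep", None, None)]:
        if op == "keep":
            if a or b:
                rules.append((" ".join(a), " ".join(b)))
                a, b = [], []
        else:
            if s is not None:
                a.append(s)
            if t is not None:
                b.append(t)
    return rules


def rank_rules(pairs, min_count=1):
    """Count candidates over a corpus and keep the frequent ones.

    Frequency filtering is an assumed stand-in for the paper's
    filter/condense/refine steps, which the abstract does not detail.
    """
    counts = Counter()
    for wrong, right in pairs:
        counts.update(extract_rules(wrong.split(), right.split()))
    return [(rule, c) for rule, c in counts.most_common() if c >= min_count]


# Hypothetical toy corpus of (erroneous, corrected) sentence pairs.
corpus = [
    ("he go to school", "he went to school"),
    ("they go to town", "they went to town"),
    ("she have a book", "she has a book"),
]
ranked = rank_rules(corpus)
```

Here `ranked` lists each candidate with its corpus frequency, most frequent first. Because the alignment only compares tokens for equality, the same sketch works with characters instead of words as the unit, which is one way to read the abstract's claim of language independence.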
Journal: IJCLCLP
Volume: 15
Publication date: 2010